Totalsat Package Demo

Thomas de Mareuil - June 2020

This notebook presents how to use the totalsat package. For more details, do not hesitate to take a look at the code or documentation (automatically generated - code is better).

Table of Contents

Download and process public satellite imagery with Python

The totalsat library contains (for now) 3 modules: sat_download, sat_preprocess and sat_tools (utils module). We present in this notebook several examples and use cases.

Search for images

totalsat connects to Google Earth Engine to browse through public image collections. For now, you can access images provided by Landsat 5, 7 and 8 (Tier 1 and Tier 2 quality) and by Sentinel-2 (Level 1-C TOA and Level 2-A BOA). You can check image availability based on a list of criteria, passed as a Python dictionary.

Criteria can include:

Several other options are available, make sure you read the documentation or directly the code function for details.

With another function, we can plot the requested area on an interactive folium map. The layer control toggle allows to switch from map to satellite view and to visualize the area split.

Download images

The satellite images come with several spectral bands (13 for Sentinel-2, 9 for Landsat), in resolutions from 10m to 60m per pixel, as described below:
bands

You can use the retrieve_images() function to download images under the defined area of interest and save them as georeferenced .tif files (+ metadata in .txt files, in a separate folder). 1 folder will be created per satellite, under a specified project name (sitename) and path (filepath). The output of the function in the notebook is a dictionary containing the metadata of each downloaded image.

If you specify merge=True, the function also automatically removes potential image duplicates and merges overlapping images (if your area of interest is under 2 different satellite passes). Careful not to set this option when you limit image size, i.e. when you split a large satellite image into several smaller tiles - as tiles are taken from the same image, they could be merged together...

If you have already downloaded the images earlier, you can just reload the metadata files by running the sat_download.get_metadata(inputs) function, and you can move on to preprocessing.

Preprocess images

Each downloaded image contains several unprocessed spectral bands stacked on top of each other. The sat_preprocess module allows you to create an RGB image or other remote sensing indices (NDVI, BSI, etc. - 7 pre-coded indices available), as well as a custom combination of spectral bands.

The RGB preprocessing mainly consists in stretching the pixel values so that they cover the entire value range. The other indices are just about computing different band combinations. These methods are meant to be a quick hack, without histogram correction techniques or additional steps that would be required for a perfectly clean output.

Note: to avoid the RGB processing, you can directly download RGB images from the 'S2_RGB' satellite (Sentinel-2 Level-2A BOA reflectance), but these images tend to have more errors than 'S2' (Sentinel-2 Level-1C TOA reflectance) - more areas with black pixels, etc.

The preprocessed images are then saved in a separate folder as georeferenced TIF files (metadata is maintained). If you specify a 1-channel remote sensing index (ex: NDVI), it also saves a JPG file with a chosen colormap for visualization (default: matplotlib's "summer" palette).

We can check that the metadata from the newly written images was maintained (with rasterio), and display one of the images:

We can also compute the NDVI (Normalized Difference Vegetation Index, showing vegetation health) and NDWI (Normalized Difference Water Index, showing image water content), using different colormaps for visualizations (see list of possible colormaps here).

Last, but not least, we can compute any band combination using a custom formula.

Valid combinations are of the form: [ R 0.5 , NIR 1 , B * 0.7 ]. Make sure you add spaces everywhere for the parser to understand the formula. Coefficients correspond to pixel intensity (in %). The first band will be interpreted as Red channel, the 2nd one as Green channel, and the 3rd one as Blue channel. This way, we can experiment, use false colors, adjust color intensity, etc... let's get creative!

You can see more details and examples of "custom scripts" through this link.

Other totalsat tools and features

Tthe totalsat package also provides several utilities in the sat_tools module and additional features, as presented below.

Split images

The split_image function from sat_tools can be used to split any image in smaller tiles (remark: tiles are saved as PNG, which doesn't maintain metadata).

The number of tiles depends either on a specific nb_tiles argument or on a max_tile_size argument. Tiles follow by default the image aspect ratio, but can be set to square shape with the square argument. They are automatically numbered and saved either in a chosen filepath_out or in the folders defined by the inputs dict (from previous steps). The show argument allows to display a preview of the split.

We can finally just check one of the small tiles:

Label images

Finally, the annotate_images function from the sat_tools module, associated with %matplotlib qt linemagic command, allows us to directly label images (either polygon annotations for segmentation or boolean labels for classification) in a notebook through an interactive matplotlib console. The labels can then be used to train object detection or object segmentation (binary masks) models.

You can pass to the function either the inputs dictionary from previous steps, or the path to a folder with georeferenced images to label.

The polygon annotations will be saved in the same folder as the images, in .geojson and .pkl formats (both in pixel coordinates and in epsg:4326). Boolean annotations are saved as .csv.

Finally, you can use the plot_geometries function to check the polygon labels on an interactive satellite map. You can pass to the function either the list of polygon coordinates outputted by annotate_images, or a list of polygon coordinates ((lon,lat) format) extracted from a .geojson file, or a geodataframe with a 'geometry' column.

And boolean annotations:

Create timelapse video

The sat_preprocess.timelapse() function takes as argument a list of settings and saves a .mp4 in the Google Drive linked to the user's GEE account. Available images are Landsat 5 (03/1984 - 05/2013), Landsat 7 (04/1999 - present), and Landsat 8 (06/2013 - present).

Possible improvements include: 1) download video directly on disk, 2) add Sentinel-2 and MODIS, 3) add remote sensing indices timelapse, 4) merge sources if dates spans over several satellite periods, 5) add dates as text overlay. To create better timelapses with almost all these options, you can refer to the awesome geemap package.

Create gif

The sat_tools.create_gif() function takes as arguments an input folder with images, the image extension name and an output path to save a GIF file. Default duration is 200ms/image, but it can be adjusted with the duration argument.

Reverse geocoding

We can use some last totalsat functions to perform reverse geocoding on a set of images. With sat_tools.get_image_centroid, we can get the coordinates of the center of a georeferenced image, and with sat_tools.get_address we can then retrieve the corresponding address.

The available address information depends on OSM through the Nominatim API (missing data is replaced with a NaN), and the user can specify a custom address parsing (among: 'full_address_df', 'full_address_str' and one or several keys among ['country', 'state', 'county', 'postcode', city', 'road', 'house_number']).

We test these functions on the images downloaded earlier in this notebook:

Download OSM data

The last feature we present in this notebook is a high-level function to download OSM data. This feature is based on the osmnx package, and is just meant to simplify its usage with a high-level command - you just have to specify an OSM tag key (ex: 'building') and an area (either a point and radius, or an address and radius, or a polygon, or a place name), and the function returns a geodataframe containing the geometries.

For more complex OSM queries, you can use the online Overpass API (link) or direct URL requests.

Example with building polygons:

Example with road linestrings: